Eukaryotic chromosome fine structure
Eukaryotic chromosome fine structure refers to the types of DNA sequence that make up eukaryotic chromosomes. Some sequences belong to more than one class, so the classes listed here are not mutually exclusive.
Chromosomal characteristics
Some sequences are required for a properly functioning chromosome:
- Centromere: Used during cell division as the attachment point for the spindle fibers.
- Telomere: Used to maintain chromosomal integrity by capping the ends of the linear chromosomes. This region is a microsatellite, but its function is more specific than that of a simple tandem repeat.
Across eukaryotes, the overall structure of chromosome ends is conserved. Each end is characterized by the telomeric tract, a series of short G-rich repeats, followed by an extensive subtelomeric region made up of repeats of various types and lengths, the telomere-associated sequences (TAS).[1] These regions are generally low in gene density, transcription, and recombination, and they replicate late. They help protect the chromosome end from degradation and end-to-end fusion, and they take part in completing replication. The subtelomeric repeats can rescue chromosome ends when telomerase fails, buffer subtelomerically located genes against transcriptional silencing, and protect the genome from deleterious rearrangements caused by ectopic recombination. They may also act as filler that raises chromosome size to a minimum threshold needed for chromosome stability, provide a location for the adaptive amplification of genes, and take part in a secondary, recombination-based mechanism of telomere maintenance when telomerase activity is absent.
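As a rough illustration of what a tract of short tandem repeats looks like at the sequence level, the following Python sketch finds the longest uninterrupted run of a repeat unit in a DNA string and reports how G-rich that unit is. The repeat unit TTAGGG (the vertebrate telomeric repeat), the toy sequence, and the function names are assumptions made for this example and do not come from the text above; real telomeric tracts are far longer and can contain variant repeats.

```python
import re
from typing import Tuple


def longest_tandem_run(sequence: str, unit: str) -> Tuple[int, int]:
    """Return (start, copies) of the longest uninterrupted tandem run of `unit`.

    Returns (-1, 0) when the unit does not occur at all.
    """
    best_start, best_copies = -1, 0
    # (?:unit)+ matches one or more back-to-back copies of the repeat unit.
    for match in re.finditer(f"(?:{re.escape(unit)})+", sequence):
        copies = len(match.group(0)) // len(unit)
        if copies > best_copies:
            best_start, best_copies = match.start(), copies
    return best_start, best_copies


def g_fraction(sequence: str) -> float:
    """Fraction of bases in `sequence` that are G (0.0 for an empty string)."""
    return sequence.count("G") / len(sequence) if sequence else 0.0


# Assumed repeat unit: TTAGGG, the vertebrate telomeric repeat (not stated in the text).
ASSUMED_UNIT = "TTAGGG"

# Hypothetical toy chromosome end: arbitrary filler followed by five copies of the unit.
toy_end = "ACGTTCAGCTA" + ASSUMED_UNIT * 5

start, copies = longest_tandem_run(toy_end, ASSUMED_UNIT)
print(f"longest run starts at index {start} with {copies} copies")
print(f"G fraction of the repeat unit: {g_fraction(ASSUMED_UNIT):.2f}")
```

A regular expression is enough here because the sketch only looks for exact copies of a known unit; detecting degenerate or unknown repeats would need a dedicated tandem-repeat finder.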
Structural sequences
Other sequences function in replication or, during interphase, in the physical organization of the chromosome.
- Ori, or origin: Origin of replication, a site where DNA replication begins.
- MAR: Matrix attachment regions, where the DNA attaches to the nuclear matrix.
Protein-coding genes
Regions of the genome with protein-coding genes include several elements:
- Enhancer regions (normally up to a few thousand base pairs upstream of the transcription start site).
- Promoter regions (normally less than a couple of hundred base pairs upstream of the transcription start site). These include elements such as the TATA and CAAT boxes, GC elements, and an initiator.
- Exons are the parts of the transcript that will eventually be transported to the cytoplasm for translation. For genes with alternative splicing, an exon is a portion of the transcript that could be translated, given the correct splicing conditions. The exons can be divided into three parts:
  - The coding region is the portion of the mRNA that will eventually be translated.
  - The upstream untranslated region (5' UTR) can serve several functions, including mRNA transport and initiation of translation (it includes portions of the Kozak sequence). It is never translated into protein (except as a result of various mutations).
  - The 3' region downstream of the stop codon is separated into two parts:
    - The 3' UTR is never translated, but contributes to mRNA stability. The poly-A tail is attached at its end. The poly-A tail is used in the initiation of translation and also seems to affect the long-term stability (aging) of the mRNA.
    - An unnamed region after the poly-A addition site, but before the actual site of transcription termination, is cleaved off during processing of the transcript, and so does not become part of the 3' UTR. Its function, if any, is unknown.
- Introns are intervening sequences between the exons that are never translated. Some sequences inside introns give rise to miRNAs, and in some cases a small gene resides completely within an intron of a larger gene. For some genes (such as the antibody genes), internal control regions are found inside introns. These situations, however, are treated as exceptions.
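The division of a mature transcript into a 5' UTR, a coding region, and a 3' UTR can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a gene-prediction method: it assumes translation starts at the first AUG (ignoring the Kozak context), reads codons in frame until the first stop codon, counts the stop codon as part of the coding region, and does not model introns or the poly-A tail. The sequence and all names are hypothetical.

```python
from typing import NamedTuple, Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


class MrnaParts(NamedTuple):
    five_prime_utr: str   # everything before the start codon
    coding_region: str    # start codon through stop codon, inclusive
    three_prime_utr: str  # everything after the stop codon


def split_mrna(mrna: str) -> Optional[MrnaParts]:
    """Split a spliced mRNA into 5' UTR, coding region, and 3' UTR.

    Assumption: the first AUG is the start codon. Returns None when there is
    no AUG or no in-frame stop codon after it.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Walk codon by codon, in frame with the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return MrnaParts(mrna[:start], mrna[start:end], mrna[end:])
    return None


# Hypothetical toy transcript: 5' UTR + (AUG GCU UAA) + 3' UTR.
toy_mrna = "GGCACC" + "AUGGCUUAA" + "CUGAAUAAA"
parts = split_mrna(toy_mrna)
print(parts)
```

Returning None for transcripts without a start codon or an in-frame stop keeps the sketch honest about inputs it cannot partition, rather than guessing at boundaries.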
Genes that are used as RNA
Many regions of the DNA are transcribed into RNA that is itself the functional product, such as ribosomal RNA (rRNA) and transfer RNA (tRNA).
Other RNAs are transcribed but not translated, and their functions have not yet been discovered.
Repeated sequences
Repeated sequences are of two basic types: sequences that are repeated in tandem in one area, and sequences whose copies are interspersed throughout the genome.
Satellites
Satellites are unique sequences that are repeated in tandem in one area. Depending on the length of the repeat unit, they are classified as either minisatellites or microsatellites.
Interspersed sequences
Interspersed sequences are non-adjacent repeats, with copies found dispersed across the genome rather than in tandem. They can be classified based on the length of the repeat as:
- SINE: Short interspersed nuclear elements. The repeats are normally a few hundred base pairs in length. These sequences constitute about 13% of the human genome,[2] with the Alu family alone accounting for about 10%.
- LINE: Long interspersed nuclear elements. The repeats are normally several thousand base pairs in length. These sequences constitute about 21% of the human genome.[2]
Both of these types are classified as retrotransposons.
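The percentages quoted above can be combined with a little arithmetic, sketched here in Python. The haploid genome size of roughly 3.1 billion base pairs is an assumption added for the example and is not given in the text; the percentages are the approximate figures quoted above, so the base-pair totals are only order-of-magnitude estimates.

```python
# Approximate percentages of the human genome, as quoted in the text.
SINE_PERCENT = 13
ALU_PERCENT = 10   # Alu is counted within the SINE total
LINE_PERCENT = 21

# Assumption (not from the text): haploid human genome of about 3.1 billion bp.
ASSUMED_GENOME_BP = 3_100_000_000


def percent_to_bp(percent: int, genome_bp: int = ASSUMED_GENOME_BP) -> int:
    """Convert a whole-number percentage of the genome into base pairs."""
    return genome_bp * percent // 100


combined_percent = SINE_PERCENT + LINE_PERCENT      # SINEs and LINEs together
non_alu_sine_percent = SINE_PERCENT - ALU_PERCENT   # SINEs other than Alu

print(f"SINEs + LINEs: about {combined_percent}% of the genome")
print(f"Non-Alu SINEs: about {non_alu_sine_percent}% of the genome")
print(f"SINEs + LINEs: roughly {percent_to_bp(combined_percent):,} bp")
```

On these figures, SINEs and LINEs together make up about a third of the genome, most of the SINE share being Alu.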
Retrotransposons
Retrotransposons are sequences in the DNA that result from retrotransposition, in which an RNA is reverse-transcribed and the resulting DNA copy is inserted into the genome. LINEs and SINEs are examples in which the sequences are repeats, but non-repeated sequences can also be retrotransposons.
Other sequences
Typical eukaryotic chromosomes contain much more DNA than is classified in the categories above. This DNA may serve as spacing or have some other as-yet-unknown function, or it may simply be random sequence of no consequence.
References
Notes
- [1] Pryde FE, Gorham HC, Louis EJ (1997). Chromosome ends: all the same under their caps. Curr Opin Genet Dev 7(6):822-828.
- [2] Pierce BA (2005). Genetics: A conceptual approach. Freeman. Page 311.